An Exploration of Approaches for the Stanford Question Answering Dataset
Abstract
In this paper, we explore several approaches of varying complexity, novelty and effectiveness to the reading comprehension problem evaluated by the Stanford Question Answering Dataset (SQuAD), from the perspective of a novice practitioner of Deep Neural Networks for Natural Language Processing. We present several models, their mathematical structure, an analysis of their implementation and an evaluation of their effectiveness in retrieving answers to questions on examples from SQuAD. In particular, we evaluate (1) a simple sequence attention-mix boundary model, (2) a more complex sequence attention-mix token model and (3) an implementation of a Match-LSTM model for reading comprehension. For each model, we also provide the results of a thorough hyperparameter search and lessons learned from implementing the model in the TensorFlow framework. Our final model achieves an F1 score of 57.7% and an exact match score of 46.3% on the official SQuAD test set. We conclude with notes on the training infrastructure we built to support model training and hyperparameter search, and practical lessons learned from operating a physical networked computer cluster for neural network training and evaluation.
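The F1 and exact match figures quoted in the abstract are token-overlap metrics between a predicted answer span and a reference answer. The following is a minimal sketch of how such scores are commonly computed for SQuAD-style answers, not the authors' evaluation code. The function names (`normalize`, `exact_match`, `f1_score`) and the normalization steps (lowercasing, stripping punctuation and English articles) are assumptions chosen for illustration, and the sketch compares against a single reference answer rather than taking the best score over several.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, gold = normalize(prediction), normalize(truth)
    if not pred or not gold:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == gold)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction that contains the reference answer plus extra words gets an exact match of 0 but a nonzero F1, which is why reported F1 scores are typically higher than exact match scores. A dataset-level score would be the mean of these per-question values.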
Related resources
Rolling Deep with the SQuAD: Question Answering
Stanford Question Answering Dataset (SQuAD) is a "reading comprehension dataset". It is composed of passages of text, a question about that text, and an answer directly from the text. It allows groups to test their question-answering systems against other submissions as well as a Human Performance benchmark of 82.3 F1, 91.221 EM scores. Our task is to build our own question-answering system usi...
Boosting Passage Retrieval through Reuse in Question Answering
Question Answering (QA) is an emerging important field in Information Retrieval. In a QA system the archive of previous questions asked from the system makes a collection full of useful factual nuggets. This paper makes an initial attempt to investigate the reuse of facts contained in the archive of previous questions to help and gain performance in answering future related factoid questions. I...
Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering
The last several years have seen intensive interest in exploring neural-network-based models for machine comprehension (MC) and question answering (QA). In this paper, we approach the problems by closely modelling questions in a neural network framework. We first introduce syntactic information to help encode questions. We then view and model different types of questions and the information shar...
A simple sequence attention model for machine comprehension
Machine comprehension is an important NLP problem with a number of important applications. Question answering in particular has attracted considerable attention in applied research. This work explores a simple sequence attention architecture for question answering, using the Stanford Question Answering Dataset (SQuAD) introduced by Rajpurkar et al. (2016).
SQuAD: 100,000+ Questions for Machine Comprehension of Text
We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. We analyze the dataset to understand the types of reasoning required to answer the questions, leaning heavily on depend...
Publication date: 2017